txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
The above annotation data package is for human (Homo sapiens) data from the UCSC build
hg19, based on the knownGene track. Annotation packages for the genomes of other organisms
are available at “http://bioconductor.org/packages/3.5/data/annotation/”.
In the next step, we will use the “annotatePeak()” function from the ChIPseeker package [9]
to annotate the peaks by associating them with the nearest genes. This function also provides
the option “tssRegion=”, which sets the window around the TSS that is treated as the
promoter region when the peaks are assigned to genomic features.
annotated_peaks <- lapply(bedfiles,
                          annotatePeak,
                          TxDb=txdb,
                          tssRegion=c(-1000, 1000), verbose=FALSE)
annotated_peaks
The above R code applies the “annotatePeak()” function to each of the peak signal files.
The TSS region was set to the range (−1000, 1000), that is, from 1000 bp upstream to
1000 bp downstream of the TSS of the gene. Figure 6.13 shows the annotation summaries for
the three samples. Each summary gives the number of annotated peaks at the top, followed
by the peak annotation frequencies for the genomic features (gene regions). The highest
frequencies are in the promoter region, which in this case suggests transcriptional
activity of the genes associated with the peaks.
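To make the role of the TSS window concrete, the following is a minimal base R sketch of the idea, not the ChIPseeker implementation. It assumes a hypothetical gene table and hypothetical peak positions on a single chromosome, represents each peak by a single position, and uses an illustrative helper named annotate_toy(). For each peak it finds the gene with the nearest TSS, reports the strand-aware signed distance, and flags the peak as a promoter peak when that distance falls inside the window.

```r
# Toy gene table (hypothetical coordinates, single chromosome).
genes <- data.frame(
  gene   = c("geneA", "geneB", "geneC"),
  start  = c(10000, 50000, 90000),
  end    = c(20000, 60000, 95000),
  strand = c("+", "-", "+"),
  stringsAsFactors = FALSE
)

# The TSS is the start of a plus-strand gene and the end of a minus-strand gene.
genes$tss <- ifelse(genes$strand == "+", genes$start, genes$end)

# Hypothetical peak positions (one coordinate per peak).
peaks <- c(10400, 60900, 80000)

# Illustrative helper (an assumption of this sketch, not a ChIPseeker function).
annotate_toy <- function(pos, genes, tss_region = c(-1000, 1000)) {
  # Signed distance to every TSS: negative is upstream, positive is downstream,
  # with the direction flipped for minus-strand genes.
  strand_sign <- ifelse(genes$strand == "+", 1, -1)
  d <- (pos - genes$tss) * strand_sign
  i <- which.min(abs(d))  # index of the gene with the nearest TSS
  data.frame(
    peak            = pos,
    nearest_gene    = genes$gene[i],
    distance_to_tss = d[i],
    promoter        = d[i] >= tss_region[1] & d[i] <= tss_region[2],
    stringsAsFactors = FALSE
  )
}

toy_annotation <- do.call(rbind, lapply(peaks, annotate_toy, genes = genes))
print(toy_annotation)
```

In this toy data the first two peaks lie within 1000 bp of a TSS and are flagged as promoter peaks, while the third is still assigned to its nearest gene but is not flagged. Widening tss_region would change only the promoter flag, not the gene assignment.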
The ChIPseeker package provides several functions to visualize the annotated peaks. The
“plotAnnoBar()” function creates a bar chart of the peak distribution across the different
genomic regions (features).
plotAnnoBar(annotated_peaks)
Figure 6.14 shows a bar plot of the percentage of peaks in the different genomic regions
of the genes. We can notice that most peaks fall in the promoter regions. This distribution
may look different depending on whether the ChIP-Seq targets TFs or histone marks.
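The quantity behind such a bar chart is simply the share of peaks assigned to each genomic feature. The base R sketch below illustrates that calculation on a small, hypothetical vector of per-peak annotation labels. The labels and counts are invented for illustration and are not taken from the samples above.

```r
# Hypothetical per-peak annotation labels (not real data).
annotation <- c("Promoter", "Promoter", "Intron", "Distal Intergenic",
                "Promoter", "Exon", "Intron", "Promoter", "3' UTR", "Promoter")

# Percentage of peaks assigned to each genomic feature.
feature_pct <- round(100 * prop.table(table(annotation)), 1)
print(feature_pct)

# Feature holding the largest share of peaks.
top_feature <- names(feature_pct)[which.max(feature_pct)]
print(top_feature)
```

Passing feature_pct to the base graphics function barplot() would draw a basic version of the same chart.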
Distribution of peaks relative to TSS:
The sites of TF binding and Pol II localization are typically found in the promoter regions
of the genes. Thus, the distribution of peaks around the TSS will give an idea about the
activity of the
FIGURE 6.13 Annotation summaries for three ChIP-Seq samples.